A Framework for Self-optimizing, Fault-tolerant, High Performance Bulk Data Transfers in a Heterogeneous Grid Environment
Abstract
The drastic increase in the data requirements of scientific applications, combined with a growing trend towards collaborative research, has created the need to transfer large amounts of data among participating sites. The general approach to transferring such large amounts of data has been either to dump the data to tapes and mail them, or to employ scripts with an operator at each site who baby-sits the transfers and deals with failures. We introduce a framework that automates the whole process of data movement between different sites. The framework requires no human intervention and recovers automatically from various kinds of storage system, network, and software failures, guaranteeing completion of the transfers. Its monitoring and tuning capabilities improve the performance of the data transfers on the fly. The framework also generates on-the-fly visualizations of the transfers, which simplifies the identification of problems and bottlenecks in the system.
Similar resources
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing, which periodically ...
Fault Tolerance in Grid – an Overview
Grid computing has emerged as a distributed methodology that coordinates resources spread across a heterogeneous distributed environment. The resources can be categorized as computational resources and storage resources. A grid is composed of a collection of heterogeneous systems such as workstations, servers, and computers that allow access to computing power, data sharing, memory use,...
A Fault Tolerant Adaptive Method for the Scheduling of Tasks in Dynamic Grids
An essential issue in distributed high-performance computing is how to allocate the workload efficiently among the processors. This is especially important in a computational Grid, whose resources are heterogeneous and dynamic. Algorithms like Quadratic Self-Scheduling (QSS) and Exponential Self-Scheduling (ESS) are useful to obtain a good load balance, reducing the communication overhead. Her...
A Decentralized Strategy for Genetic Scheduling in Heterogeneous Environments
The paper describes a solution to the key problem of ensuring high performance behavior of the Grid, namely the scheduling of activities. It presents a distributed, fault-tolerant, scalable and efficient solution for optimizing task assignment. The scheduler uses a combination of genetic algorithms and lookup services for obtaining a scalable and highly reliable optimization tool. The experimen...
Efficient Resource Management Mechanism with Fault Tolerant Model for Computational Grids
Grid computing provides a framework and deployment environment that enables resource sharing, access, aggregation and management. It allows the coordinated use of various resources in dynamic, distributed virtual organizations. Grid scheduling is responsible for resource discovery, resource selection and job assignment over a decentralized heterogeneous system. In the existing sy...